Smart Paper Notes

Offline Meta-Reinforcement Learning for Industrial Insertion

Tony Z. Zhao , Jianlan Luo , Oleg Sushkov , Rugile Pevceviciute , Nicolas Heess , Jon Scholz , Stefan Schaal , Sergey Levine

Category: Robotics

2021-10-08

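Reinforcement learning (RL) can in principle let robots automatically adapt to new tasks, but current RL methods require a large number of trials to accomplish this. In this paper, we tackle rapid adaptation to new tasks through the framework of meta-learning, which uses past tasks to learn to adapt, with a specific focus on industrial insertion tasks. Fast adaptation is crucial because a prohibitively large number of on-robot trials can damage hardware. Effective adaptation is also feasible because experience from different insertion applications can largely be shared among them. In this setting, we address two specific challenges in applying meta-learning. First, conventional meta-RL algorithms require lengthy online meta-training. We show that this can be replaced with appropriately chosen offline data, resulting in an offline meta-RL method that only requires demonstrations and trials from each of the prior tasks, without the need to run costly meta-RL procedures online. Second, meta-RL methods can fail to generalize to new tasks that are too different from those seen at meta-training time, which poses a particular challenge in industrial applications, where high success rates are critical. We address this by combining contextual meta-learning with direct online finetuning: if the new task is similar to those seen in the prior data, the contextual meta-learner adapts immediately; if it is too different, the policy adapts gradually through finetuning. We show that our approach quickly adapts to a variety of insertion tasks, reaching a 100% success rate with only a fraction of the samples needed to learn the tasks from scratch. Experiment videos and details are available at https://sites.google.com/view/offline-metarl-insertion.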

translated by Google Translate

Surveillance Face Anti-spoofing

Hao Fang , Ajian Liu , Jun Wan , Sergio Escalera , Chenxu Zhao , Xu Zhang , Stan Z. Li , Zhen Lei

Category: Computer Vision

2023-01-03

Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (e.g., phone unlocking) while lacking consideration of long-distance scenes (e.g., surveillance security checks). To promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In these scenes, low image resolution and noise interference are new challenges for surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by combining the super-resolution network. (2) Generated sample pairs are used to simulate quality variance distributions, helping contrastive learning strategies obtain robust feature representations under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, extensive experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Non-equispaced Fourier Neural Solvers for PDEs

Haitao Lin , Lirong Wu , Yongjie Xu , Yufei Huang , Siyuan Li , Guojiang Zhao , Stan Z , Li Cari

Category: Machine Learning

2022-12-09

Solving partial differential equations is difficult. Recently proposed neural resolution-invariant models, despite their effectiveness and efficiency, usually require equispaced spatial points of data. However, sampling in the spatial domain is sometimes inevitably non-equispaced in real-world systems, limiting their applicability. In this paper, we propose a Non-equispaced Fourier PDE Solver (NFS) with adaptive interpolation on resampled equispaced points and a variant of Fourier Neural Operators as its components. Experimental results on complex PDEs demonstrate its advantages in accuracy and efficiency. Compared with the spatially-equispaced benchmark methods, it achieves superior performance with a 42.85% improvement in MAE, and is able to handle non-equispaced data with a tiny loss of accuracy. Besides, to the best of our knowledge, NFS is the first ML-based method with mesh-invariant inference ability to successfully model turbulent flows in non-equispaced scenarios, with a minor deviation of the error on unseen spatial points.
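The idea of resampling scattered points onto an equispaced grid before applying a Fourier-domain operation can be illustrated with a minimal sketch. It is not the NFS model: it assumes a 1D signal, swaps the paper's adaptive interpolation for plain linear interpolation (`numpy.interp`), and replaces the learned Fourier Neural Operator layer with a fixed low-pass filter. The function names `resample_to_equispaced` and `low_pass_spectral` are hypothetical.

```python
import numpy as np

def resample_to_equispaced(x, u, n_grid):
    """Linearly interpolate samples u(x) at scattered x onto n_grid equispaced points."""
    order = np.argsort(x)
    x_sorted, u_sorted = x[order], u[order]
    # The grid spans only the sampled interval, so no extrapolation is needed.
    grid = np.linspace(x_sorted[0], x_sorted[-1], n_grid)
    return grid, np.interp(grid, x_sorted, u_sorted)

def low_pass_spectral(u_grid, n_modes):
    """Keep the lowest n_modes Fourier modes (a fixed stand-in for a learned spectral layer)."""
    coeffs = np.fft.rfft(u_grid)
    coeffs[n_modes:] = 0.0
    return np.fft.irfft(coeffs, n=u_grid.size)

# Non-equispaced samples of a smooth function on [0, 1] (synthetic illustration data).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, size=200))
u = np.sin(2 * np.pi * x)

grid, u_grid = resample_to_equispaced(x, u, n_grid=64)
u_smooth = low_pass_spectral(u_grid, n_modes=8)
```

The FFT requires equispaced samples, which is why the resampling step comes first; a learned model would replace the fixed filter with trainable per-mode weights.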


Teaching Yourself: Graph Self-Distillation on Neighborhood for Node Classification

Lirong Wu , Jun Xia , Haitao Lin , Zhangyang Gao , Zicheng Liu , Guojiang Zhao , Stan Z. Li

Category: Machine Learning

2022-10-05

Recent years have witnessed great success in handling graph-related tasks with Graph Neural Networks (GNNs). Despite the great academic success of GNNs, Multi-Layer Perceptrons (MLPs) remain the primary workhorse for practical industrial applications. One reason for this academic-industrial gap is the neighborhood-fetching latency incurred by data dependency in GNNs, which makes them hard to deploy in latency-sensitive applications that require fast inference. Conversely, without involving any feature aggregation, MLPs have no data dependency and infer much faster than GNNs, but their performance is less competitive. Motivated by these complementary strengths and weaknesses, we propose a Graph Self-Distillation on Neighborhood (GSDN) framework to reduce the gap between GNNs and MLPs. Specifically, the GSDN framework is based purely on MLPs, where structural information is only implicitly used as a prior to guide knowledge self-distillation between the neighborhood and the target, substituting for the explicit neighborhood information propagation used in GNNs. As a result, GSDN enjoys the benefits of graph topology-awareness in training but has no data dependency in inference. Extensive experiments have shown that the performance of vanilla MLPs can be greatly improved with self-distillation, e.g., GSDN improves over stand-alone MLPs by 15.54% on average and outperforms the state-of-the-art GNNs on six datasets. Regarding inference speed, GSDN infers 75X-89X faster than existing GNNs and 16X-25X faster than other inference acceleration methods.
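One way to picture distillation between a node and its neighborhood is a loss that pulls each node's predicted class distribution toward the average prediction of its neighbors. The sketch below is an assumption about the general shape of such a loss, not the GSDN objective itself: the KL form, the uniform neighbor average, and the name `neighborhood_distillation_loss` are all hypothetical. The adjacency matrix appears only in the loss, so prediction from the logits needs no neighbor lookup.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def neighborhood_distillation_loss(logits, adj):
    """Mean KL(neighbor-average prediction || node prediction) over all nodes.

    logits: (n_nodes, n_classes) outputs of an MLP applied to node features.
    adj:    (n_nodes, n_nodes) binary adjacency matrix without self-loops.
    """
    p = softmax(logits)
    deg = adj.sum(axis=1, keepdims=True)
    # Average the neighbors' predictions; isolated nodes contribute zero loss.
    teacher = adj @ p / np.maximum(deg, 1)
    eps = 1e-12
    kl = np.sum(teacher * (np.log(teacher + eps) - np.log(p + eps)), axis=1)
    return kl.mean()

# Path graph 0 - 1 - 2 with two classes (synthetic illustration data).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
disagreeing = np.array([[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
agreeing = np.array([[2.0, 0.0], [2.0, 0.0], [2.0, 0.0]])

loss_disagree = neighborhood_distillation_loss(disagreeing, adj)
loss_agree = neighborhood_distillation_loss(agreeing, adj)
```

The loss is larger when neighboring nodes disagree and close to zero when they agree, which is the sense in which graph structure acts as a training-time prior.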


Exploring Generative Neural Temporal Point Process

Haitao Lin , Lirong Wu , Guojiang Zhao , Pai Liu , Stan Z. Li

Category: Machine Learning

2022-08-03

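Temporal point processes (TPPs) are commonly used to model asynchronous event sequences featuring occurrence timestamps, described by probabilistic models conditioned on historical impacts. While many previous works have focused on the "goodness-of-fit" of TPP models by maximizing the likelihood, their predictive performance is unsatisfactory, which means that the timestamps generated by the models are far from the true observations. Recently, deep generative models such as denoising diffusion and score matching models have achieved great progress in image generation tasks by demonstrating their ability to generate high-quality samples. However, there are no complete and unified works exploring and investigating the potential of generative models in the context of event occurrence modeling for TPPs. In this work, we try to fill the gap by designing a unified generative framework for neural temporal point processes (GNTPP) to explore their feasibility and effectiveness, and to further improve the models' predictive performance. In addition, for measuring historical impacts, we revise the attentive models that summarize the influence of historical events with an adaptive reweighting term that considers the events' type relations and time intervals. Extensive experiments have been conducted to illustrate the improved predictive capability of GNTPP with a line of generative probabilistic decoders, and the performance gain from the revised attention. To the best of our knowledge, this is the first work that adapts generative models in a complete unified framework and studies their effectiveness in the context of TPPs. Our codebase, including all the methods given in Section 5.1.1, is open at https://github.com/bird-tao/gntpp. We hope the code framework can facilitate future research on neural TPPs.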


A Boosting Algorithm for Positive-Unlabeled Learning

Yawen Zhao , Mingzhe Zhang , Chenhao Zhang , Tony Chen , Nan Ye , Miao Xu

Category: Machine Learning | Artificial Intelligence

2022-05-19

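Positive-unlabeled (PU) learning deals with binary classification problems when only positive (P) and unlabeled (U) data are available. Many PU methods based on linear models and neural networks have been proposed. However, there is still a lack of study on how theoretically sound boosting-style algorithms can work with P and U data. Considering that in some scenarios neural networks cannot perform as well as boosting algorithms even with fully supervised data, we propose a novel boosting algorithm for PU learning, Ada-PU, and compare it against neural networks. Ada-PU follows the general procedure of AdaBoost while maintaining and updating two different distributions of P data. After a weak classifier is learned on the newly updated distribution, the corresponding combining weight for the final ensemble is estimated using only PU data. We demonstrate that, with a smaller set of base classifiers, the proposed method is guaranteed to keep the theoretical properties of boosting algorithms. In experiments, we show that Ada-PU outperforms neural networks on benchmark PU datasets. We also study UNSW-NB15, a real-world cyber security dataset, and demonstrate that Ada-PU has superior performance for malicious activity detection.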
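For reference, the general AdaBoost procedure that Ada-PU is said to follow can be sketched as below. This is standard, fully supervised AdaBoost with decision stumps, not Ada-PU: it keeps a single weight distribution over labeled examples, whereas the abstract describes two distributions over P data and combining weights estimated from PU data only. All function names and the toy data are illustrative assumptions.

```python
import numpy as np

def fit_stump(X, y, w):
    """Return (feature, threshold, polarity, weighted_error) of the best decision stump."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def stump_predict(X, stump):
    """Predict +1 or -1 for each row of X with a fitted stump."""
    j, thr, pol, _ = stump
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def adaboost(X, y, n_rounds=10):
    """Standard AdaBoost: reweight examples so later stumps focus on earlier mistakes."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        # Clip the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[3], 1e-12), 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)  # combining weight of this stump
        pred = stump_predict(X, stump)
        w = w * np.exp(-alpha * y * pred)      # up-weight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict(X, ensemble):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(X, s) for alpha, s in ensemble)
    return np.where(score >= 0, 1, -1)

# Toy 1D problem that no single stump can solve (labels in {-1, +1}).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1, 1, -1, -1, 1, 1])
model = adaboost(X, y, n_rounds=3)
y_hat = predict(X, model)
```

In the PU setting the true labels of the unlabeled examples are unavailable, so the weighted error and the combining weight cannot be computed this way, which is the gap the abstract says Ada-PU addresses.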


Misinformation Detection in Social Media Video Posts

Kehan Wang , David Chan , Seth Z. Zhao , John Canny , Avideh Zakhor

Category: Computer Vision

2022-02-15

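With the growing adoption of short-form video by social media platforms, reducing the spread of misinformation through video posts has become a critical challenge for social media providers. In this paper, we develop methods to detect misinformation in social media posts, exploiting modalities such as video and text. Due to the lack of large-scale public data for misinformation detection in multi-modal datasets, we collect 160,000 video posts from Twitter and leverage self-supervised learning to learn expressive representations of joint visual and textual data. In this work, we propose two new methods for detecting semantic inconsistencies within short-form social media video posts, based on contrastive learning and masked language modeling. We demonstrate that our new approaches outperform current state-of-the-art methods for semantic misinformation both on artificial data generated by randomly swapping positive samples and in the wild on a new manually labeled test set.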


Silicon photonic subspace neural chip for hardware-efficient deep learning

Chenghao Feng , Jiaqi Gu , Hanqing Zhu , Zhoufeng Ying , Zheng Zhao , David Z. Pan , Ray T. Chen

Category: Machine Learning

2021-11-11

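As deep learning has shown revolutionary performance in many artificial intelligence applications, its escalating computational demands require hardware accelerators that offer massive parallelism and improved throughput. The optical neural network (ONN) is a promising candidate for next-generation neurocomputing due to its high parallelism, low latency, and low energy consumption. Here, we devise a hardware-efficient photonic subspace neural network (PSNN) architecture, which targets lower optical component usage, area cost, and energy consumption than previous ONN architectures with comparable task performance. Additionally, a hardware-aware training framework is provided to minimize the required device programming precision, reduce the chip area, and improve noise robustness. We experimentally demonstrate our PSNN on a butterfly-style programmable silicon photonic integrated circuit and show its utility in practical image recognition tasks.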


IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning

Pan Lu , Liang Qiu , Jiaqi Chen , Tony Xia , Yizhou Zhao , Wei Zhang , Zhou Yu , Xiaodan Liang , Song-Chun Zhu

Category: Computer Vision | Artificial Intelligence | Natural Language Processing | Machine Learning

2021-10-25

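Current visual question answering (VQA) tasks mainly consider answering human-annotated questions about natural images. However, aside from natural images, abstract diagrams with semantic richness are still understudied in visual understanding and reasoning research. In this work, we introduce a new challenge, Icon Question Answering (IconQA), with the goal of answering a question in an icon image context. We release IconQA, a large-scale dataset that consists of 107,439 questions and three sub-tasks: multi-image-choice, multi-text-choice, and filling-in-the-blank. The IconQA dataset is inspired by real-world diagram word problems, which highlight the importance of abstract diagram understanding and comprehensive cognitive reasoning. Thus, IconQA requires not only perception skills such as object recognition and text understanding, but also diverse cognitive reasoning skills, such as geometric reasoning, commonsense reasoning, and arithmetic reasoning. To help potential IconQA models learn semantic representations of icon images, we further release an icon dataset, Icon645, which contains 645,687 colored icons in 377 classes. We conduct extensive user studies and blind experiments and reproduce a wide range of advanced VQA methods to benchmark the IconQA task. In addition, we develop a strong IconQA baseline, Patch-TRM, which applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset. IconQA and Icon645 are available at https://iconqa.github.io.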
